The fusion of visual lip movements and mixed speech signals for robust speech separation

Authors

  • Parham Aarabi
  • Bobji Mungamuru
Abstract

A technique for the early fusion of visual lip movements and a vector of mixed speech signals is proposed. This technique involves the initial recreation of speech signals entirely from the visual lip motions of each speaker. By using geometric parameters of the lips obtained from the Tulips1 database and the Audio-Visual Speech Processing dataset, a virtual speech signal is recreated by using audiovisual training segments as a basis for the recreation. It is shown that the visually created speech signal has an envelope that is directly related to the envelope of the original acoustic signal. This visual signal envelope reconstruction is then used to aid in the robust separation of the mixed speech signals by using the envelope information to identify the vocally active and silent periods of each speaker. It is shown that, unlike previous signal separation techniques, which required an ideal mixture of independent signals, the mixture coefficients can be very accurately estimated using the proposed technique in even non-ideal situations. For example, in the presence of speech noise, the mixing coefficients can be correctly estimated with signal-to-noise ratios (SNRs) as low as 0 dB, while in the presence of Gaussian noise, the estimation can be accurately done with SNRs as low as 10 dB.
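The idea of using speaker activity to estimate the mixing coefficients can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes an instantaneous linear mixture x = A s, and it assumes that binary activity masks (here synthetic, standing in for a thresholded visually reconstructed envelope) are already available. The function name `estimate_mixing_columns` and all parameter values are hypothetical. During periods when only one speaker is active, the mixture samples lie along that speaker's column of A, so the dominant eigenvector of the mixture covariance over those periods gives the column up to scale.

```python
import numpy as np

def estimate_mixing_columns(x, active):
    """Estimate mixing-matrix columns from solo-speaker periods.

    x      : (n_mics, n_samples) array of mixed signals.
    active : (n_speakers, n_samples) boolean activity masks
             (assumed here to come from a visual envelope estimate).
    Each column is returned scaled so its largest-magnitude entry is 1,
    because the overall scale of a column cannot be recovered.
    """
    n_mics, n_speakers = x.shape[0], active.shape[0]
    A_hat = np.zeros((n_mics, n_speakers))
    for j in range(n_speakers):
        # Samples where speaker j is active and every other speaker is silent.
        others = np.delete(active, j, axis=0).any(axis=0)
        solo = active[j] & ~others
        if solo.sum() < 2:
            raise ValueError(f"no solo period found for speaker {j}")
        seg = x[:, solo]
        cov = seg @ seg.T / seg.shape[1]
        # eigh returns eigenvalues in ascending order; take the dominant vector.
        _, vecs = np.linalg.eigh(cov)
        col = vecs[:, -1]
        A_hat[:, j] = col / col[np.argmax(np.abs(col))]
    return A_hat

# Synthetic demonstration (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 5000
active = np.zeros((2, n), dtype=bool)
active[0, :3000] = True      # speaker 0 talks first
active[1, 2000:] = True      # speaker 1 overlaps, then talks alone
s = rng.standard_normal((2, n)) * active          # silent when inactive
A_true = np.array([[1.0, 0.4],
                   [0.3, 1.0]])
x = A_true @ s + 0.01 * rng.standard_normal((2, n))  # light Gaussian noise

A_hat = estimate_mixing_columns(x, active)
s_hat = np.linalg.solve(A_hat, x)                 # unmix with the estimate
```

In this toy setting the columns of `A_true` already have a largest entry of 1, so `A_hat` can be compared with it directly; with real recordings only the direction of each column is identifiable.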


Related articles

Application of the blind source separation algorithm to separating speech and music signals

In this paper, the application of the Independent Component Analysis technique in speech-music separation is discussed. The separation algorithm is in the time domain. It needs the score function estimation to minimize the mutual information. For estimating score function, sufficient samples of the mixed (speech-music) signals...


Speech extraction based on ICA and audio-visual coherence

We present a new approach to the source separation problem for multiple speech signals. Using the extra visual information of the speaker’s face, the method aims to extract an acoustic speech signal from other acoustic signals by exploiting its coherence with the speaker’s lip movements. We define a statistical model of the joint probability of visual and spectral audio input for quantifying th...


Extracting an AV speech source f

We present a new approach to the source separation problem for multiple speech signals. Using the extra visual information of the speaker’s face, the method aims to extract an acoustic speech signal from other acoustic signals by exploiting its coherence with the speaker’s lip movements. We define a statistical model of the joint probability of visual and spectral audio input for quantifying the ...


Lip movements entrain the observers’ low-frequency brain oscillations to facilitate speech intelligibility

During continuous speech, lip movements provide visual temporal signals that facilitate speech processing. Here, using MEG we directly investigated how these visual signals interact with rhythmic brain activity in participants listening to and seeing the speaker. First, we investigated coherence between oscillatory brain activity and speaker's lip movements and demonstrated significant entrainm...


Audio-Visual Speech Modeling for Continuous Speech Recognition

This paper describes a speech recognition system that uses both acoustic and visual speech information to improve the recognition performance in noisy environments. The system consists of three components: 1) a visual module; 2) an acoustic module; and 3) a sensor fusion module. The visual module locates and tracks the lip movements of a given speaker and extracts relevant speech features. This...



Journal:
  • Information Fusion

Volume 5

Publication date: 2004